A JavaScript program in a Web page can clear the page content, including the program itself,
and generate new content. The program can generate exactly the same content, including
the program itself. This means that a Web page can reproduce itself by means of a JavaScript
program included in the page. Although exact reproduction is useless, inexact reproduction,
which transforms part of the content, is usable for more practical purposes. For example, Web
pages that change their view from outline mode to d ...

Among the HTML elements, HTML tables [RHJ98] encapsulate hierarchically structured data
(hierarchical data in short) in a tabular structure. HTML tables do not come with a rigid
schema, and almost any form of two-dimensional table is acceptable according to the
HTML grammar. This relaxation complicates the process of retrieving hierarchical data from 
HTML tables. In this paper, we propose an automated approach for retrieving hierarchical 
data from HTML tables. The proposed approach constr ... 

Dennis J. Bouvier

October 1995 ACM SIGAPP Applied Computing Review, Volume 3 issue 2 

In the brief history of the World Wide Web (WWW), much has changed. Millions of web
pages have been published in a relatively short time. Next to the Web content itself, one of
the most dynamic aspects of the WWW is the development of HyperText Markup Language
(HTML). This paper explores the various versions of HTML and gives a status report on 
HTML standards development. A discussion of possible future trends is also included. 

Adapting content to mobile devices: DOM-based content extraction of HTML documents

Suhit Gupta, Gail Kaiser, David Neistadt, Peter Grimm 

May 2003 Proceedings of the twelfth international conference on World Wide Web 

Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous 
links) around the body of an article that distracts a user from actual content. Extraction of 
"useful and relevant" content from web pages has many applications, including cell phone 
and PDA browsing, speech rendering for the visually impaired, and text summarization.
Most approaches to removing clutter or making content more readable involve changing 
font size or removing HTML and data components such as imag ... 

Current techniques employed to aurally render HTML tables often result in output that is
very difficult for sight-impaired users to understand. This paper proposes TTPML, an XML- 
compliant markup language, which facilitates the generation of prose descriptions of tabular 
information. The markup language enables content creators to specify contextual 
reinforcement of, and linear navigation through, tabular information. The markup language 
may be applied to pre-existing Web content and is reusable ... 

HTML anchors are often surrounded by text that seems to describe the destination page
appropriately. The text surrounding a link, or the link-context, is used for a variety of tasks
associated with Web information retrieval. These tasks can benefit by identifying 
regularities in the manner in which "good" contexts appear around links. In this paper, we 
describe a framework for conducting such a study. The framework serves as an evaluation 
platform for comparing various link-context derivati ... 

This paper describes how ILOG, a French software company designing C++ and Java class
libraries, managed the transition between paper-only documentation and extensive HTML 
online documentation in less than two years. In this paper, we analyze the underlying 
reasons for making this change, describe the technological choices that were made, and 
walk through the various steps of the project from its beginning to final completion. 

An index connects readers with information. Creating an index for a single book is a time-
honored craft. Creating an index for a massive library of HTML topics is a modern craft that 
has largely been discarded in favor of robust search engines. The authors show how they 
optimized a single-sourced index for collections of HTML topics, printed books, and PDF 
books. With examples from a recent index of 24,000 entries for 7,000 distinct HTML topics 
also published as 40 different PDF books, the autho ... 

HTML documents composed of frames can be difficult to write correctly. We demonstrate a
technique that can be used by authors manually creating HTML documents (or by document 
editors) to verify that complex frame construction exhibits the intended behavior when 
browsed. The method is based on model checking (an automated program verification 
technique), and on temporal logic specifications of expected frame behavior. We show how

In this paper, we provide a progress report on the development of technology to support
the non-visual navigation of complex HTML and XML structures. 